Conversation

@nguidotti
Contributor

@nguidotti nguidotti commented Aug 15, 2025

This PR introduces the following changes:

  • Implements a simple diving procedure
  • Allows the branch-and-bound to switch between different search strategies: BEST_FIRST, DEPTH_FIRST and MULTITHREADED_BEST_FIRST_WITH_DIVING
  • Refactors the branch-and-bound code so that the solve function is now organized into separate methods
  • Moves some commonly used variables into member variables of the branch-and-bound solver.
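
As a rough illustration of how the search strategies named above differ, the sketch below contrasts the order in which open nodes would be selected under best-first and depth-first search. The names `node_t`, `best_first_order` and `depth_first_order` are hypothetical and are not taken from the solver; only the strategy names come from this PR.

```cpp
#include <queue>
#include <stack>
#include <vector>

// Hypothetical node type for illustration; not the solver's actual node class.
struct node_t {
  int id;
  double lower_bound;
};

// Best-first: always select the open node with the smallest lower bound.
std::vector<int> best_first_order(const std::vector<node_t>& open_nodes)
{
  auto worse = [](const node_t& a, const node_t& b) { return a.lower_bound > b.lower_bound; };
  std::priority_queue<node_t, std::vector<node_t>, decltype(worse)> heap(worse, open_nodes);
  std::vector<int> order;
  while (!heap.empty()) {
    order.push_back(heap.top().id);
    heap.pop();
  }
  return order;
}

// Depth-first: always select the most recently created node (LIFO).
std::vector<int> depth_first_order(const std::vector<node_t>& open_nodes)
{
  std::stack<node_t> stack;
  for (const auto& n : open_nodes) { stack.push(n); }
  std::vector<int> order;
  while (!stack.empty()) {
    order.push_back(stack.top().id);
    stack.pop();
  }
  return order;
}
```

Best-first selection tends to improve the lower bound fastest, while depth-first selection reaches leaves (and therefore candidate integer solutions) sooner, which is the motivation for combining best-first search with dives.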

@nguidotti nguidotti self-assigned this Aug 15, 2025
@nguidotti nguidotti requested a review from a team as a code owner August 15, 2025 13:25
@nguidotti nguidotti added improvement Improves an existing functionality mip labels Aug 15, 2025
@copy-pr-bot

copy-pr-bot bot commented Aug 15, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@nguidotti nguidotti added the non-breaking Introduces a non-breaking change label Aug 15, 2025
@anandhkb anandhkb added this to the 25.10 milestone Aug 18, 2025
@nguidotti
Contributor Author

/ok to test f9fb5ad

@nguidotti
Contributor Author

/ok to test 041e104

@nguidotti
Contributor Author

/ok to test dd7e340

? (lower_bound == 0.0 ? 0.0 : std::numeric_limits<f_t>::infinity())
: std::abs(obj_value - lower_bound) / std::abs(obj_value);
if (user_mip_gap != user_mip_gap) { return std::numeric_limits<f_t>::infinity(); }
// Handle NaNs (i.e., NaN != NaN)
Contributor

The test user_mip_gap != user_mip_gap is equivalent to std::isnan(user_mip_gap).

Contributor Author

This change was suggested by Alice here
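
For readers following this thread, here is a minimal sketch of the NaN handling under discussion. The helper name `relative_gap` is hypothetical, and the zero-objective guard is an assumption, because the quoted diff does not show the full condition. Under IEEE 754 arithmetic, `x != x` is true only when `x` is NaN, which is what `std::isnan` reports; aggressive floating-point flags such as `-ffast-math` may break either form.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper for illustration only. The guard on obj_value == 0 is an
// assumption; the quoted diff does not show the original condition.
template <typename f_t>
f_t relative_gap(f_t obj_value, f_t lower_bound)
{
  constexpr f_t inf = std::numeric_limits<f_t>::infinity();
  f_t gap = (obj_value == f_t(0))
              ? (lower_bound == f_t(0) ? f_t(0) : inf)
              : std::abs(obj_value - lower_bound) / std::abs(obj_value);
  // inf - inf yields NaN (e.g., no incumbent and an infinite bound);
  // report such a gap as infinite instead of propagating the NaN.
  if (std::isnan(gap)) { return inf; }
  return gap;
}
```

With an infinite objective and an infinite lower bound, the subtraction produces NaN, so the function returns infinity instead of a NaN gap.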

i_t nodes_explored = 0;

while (node_stack.size() > 0) {
repair_heuristic_solutions(lower_bound, solution);
Contributor

Why do you need to repair a heuristic solution in the dive?

I would let the best-first search thread handle repairs.

lower_bound = lower_bound_ = root_node.lower_bound;
mutex_lower.unlock();
gap = get_upper_bound() - lower_bound;
if (settings_.bnb_search_strategy == search_strategy_t::DEPTH_FIRST) {
Contributor

We are in a dive. Don't we know what the search strategy is?

stats_.nodes_unexplored = 0;
stats_.num_nodes = 1;

if (settings_.bnb_search_strategy == search_strategy_t::DEPTH_FIRST) {
Contributor

Why would the search strategy ever be depth first here?

I think we only want best first or multithreaded best first with diving

Contributor Author

@nguidotti nguidotti Sep 18, 2025

@akifcorduk wanted to have a diving-only option for submip (which is why I have the checks above). However, for simplicity's sake, we have just a single search strategy: BFS + diving. Is that OK?


enum class search_strategy_t {
BEST_FIRST = 0,
DEPTH_FIRST = 1,
Contributor

Remove DEPTH_FIRST?

mutex_gap_.unlock();
}
}

Contributor

To confirm: in multithreaded BFS + dives, do the explored and unexplored node counts come only from the BFS thread?

Contributor

@chris-maes chris-maes left a comment

Thanks for simplifying the PR. I left a few more comments. These are mostly nitpicks.

The only change I would suggest making before merging is to remove repairing solutions from the diving threads.

You should make sure to run the MIPLIB benchmarks and put the geomean performance improvement in the commit message. You should also run this on a MIP problem that is infeasible from the LP relaxation and one that is infeasible due to no integer solution and verify the code correctly handles these cases.

@nguidotti
Contributor Author

/ok to test 413ee24

Signed-off-by: nicolas <nguidotti@nvidia.com>
@nguidotti
Contributor Author

/ok to test 53127f5

@nguidotti
Contributor Author

/ok to test 772f49e

@nguidotti
Contributor Author

nguidotti commented Sep 19, 2025

I got the following results for the MIPLIB2017 benchmark (arithmetic means of the primal gap):

branch-25.10 (3dc71e7): 0.233632
diving (#305): 0.217486

i.e., the mean primal gap drops by about 1.6 percentage points (roughly 6.9% in relative terms) compared with the main branch.

The neos859080 instance is integer infeasible, which the solver detects correctly (the log is below). There is also a unit test that checks a MIP problem that is infeasible from the root LP relaxation (DOC_EXAMPLE_TEST).

cuOpt version: 25.10.0, git hash: ef90a38, host arch: x86_64, device archs: 90a-real
CPU: Intel(R) Xeon(R) Platinum 8480C, threads (physical/logical): 112/224, RAM: 1749.73 GiB
CUDA 13.0, device: NVIDIA H200 (ID 0), VRAM: 139.80 GiB
CUDA device UUID: ffffffdbffffff8effffffe3ffffffe5-fff

Unpresolved problem:: 164 constraints, 160 variables, 1280 nonzeros
Presolve status:: reduced the problem
Presolve removed:: 4 constraints, 0 variables, 65 nonzeros
Presolved problem:: 160 constraints, 160 variables, 1215 nonzeros
Third party presolve time: 0.164001
Solving a problem with 160 constraints 160 variables (160 integers) and 1215 nonzeros
Objective offset 0.000000 scaling_factor 1.000000
Running presolve!
After trivial presolve #constraints 160 #variables 160 objective offset 0.000000.
Solving LP root relaxation
Scaling matrix. Maximum column norm 1.187913e+00
Dual Simplex Phase 1
Dual feasible solution found.
Dual Simplex Phase 2
 Iter     Objective           Num Inf.  Sum Inf.     Perturb  Time
    1 +3.3333333333334364e-01      97 7.30512933e+01 0.00e+00 0.00

Root relaxation solution found in 105 iterations and 0.00s
Root relaxation objective +1.00000000e+00

Strong branching on 4 fractional variables
| Explored | Unexplored | Objective   |    Bound    | Depth | Iter/Node |  Gap   |    Time 
        1        1                +inf  +1.000000e+00      1   0.0e+00       -        0.00
Infeasible after bounds strengthening. Fathoming node 727.
Infeasible after bounds strengthening. Fathoming node 729.
Infeasible after bounds strengthening. Fathoming node 733.
Infeasible after bounds strengthening. Fathoming node 737.
Infeasible after bounds strengthening. Fathoming node 736.
Infeasible after bounds strengthening. Fathoming node 743.
Infeasible after bounds strengthening. Fathoming node 939.
Infeasible after bounds strengthening. Fathoming node 919.
Infeasible after bounds strengthening. Fathoming node 1085.
Infeasible after bounds strengthening. Fathoming node 1109.
Infeasible after bounds strengthening. Fathoming node 1135.
Infeasible after bounds strengthening. Fathoming node 1174.
Infeasible after bounds strengthening. Fathoming node 1341.
Infeasible after bounds strengthening. Fathoming node 1373.
Infeasible after bounds strengthening. Fathoming node 1399.
Infeasible after bounds strengthening. Fathoming node 1405.
Infeasible after bounds strengthening. Fathoming node 1435.
Infeasible after bounds strengthening. Fathoming node 1439.
Infeasible after bounds strengthening. Fathoming node 1441.
Infeasible after bounds strengthening. Fathoming node 1440.
Infeasible after bounds strengthening. Fathoming node 1471.
Infeasible after bounds strengthening. Fathoming node 1485.
Infeasible after bounds strengthening. Fathoming node 1649.
Infeasible after bounds strengthening. Fathoming node 1713.
Infeasible after bounds strengthening. Fathoming node 1715.
Infeasible after bounds strengthening. Fathoming node 1847.
Infeasible after bounds strengthening. Fathoming node 1859.
Infeasible after bounds strengthening. Fathoming node 1989.
Infeasible after bounds strengthening. Fathoming node 2059.
Infeasible after bounds strengthening. Fathoming node 2105.
Infeasible after bounds strengthening. Fathoming node 2125.
Infeasible after bounds strengthening. Fathoming node 2187.
Infeasible after bounds strengthening. Fathoming node 2269.
Infeasible after bounds strengthening. Fathoming node 2261.
Numerical issue node 2305. Resolving from scratch.
Infeasible after bounds strengthening. Fathoming node 2463.
Infeasible after bounds strengthening. Fathoming node 2467.
Infeasible after bounds strengthening. Fathoming node 2503.
Infeasible after bounds strengthening. Fathoming node 2625.
Infeasible after bounds strengthening. Fathoming node 2651.
Infeasible after bounds strengthening. Fathoming node 2652.
Infeasible after bounds strengthening. Fathoming node 2653.
Infeasible after bounds strengthening. Fathoming node 2731.
Infeasible after bounds strengthening. Fathoming node 2746.
Infeasible after bounds strengthening. Fathoming node 2747.
Infeasible after bounds strengthening. Fathoming node 2743.
Infeasible after bounds strengthening. Fathoming node 2929.
Infeasible after bounds strengthening. Fathoming node 3013.
Infeasible after bounds strengthening. Fathoming node 3059.
Infeasible after bounds strengthening. Fathoming node 3065.
Infeasible after bounds strengthening. Fathoming node 3087.
Infeasible after bounds strengthening. Fathoming node 3121.
Infeasible after bounds strengthening. Fathoming node 1563.
Infeasible after bounds strengthening. Fathoming node 3161.
Infeasible after bounds strengthening. Fathoming node 3155.
Infeasible after bounds strengthening. Fathoming node 3215.
Infeasible after bounds strengthening. Fathoming node 3281.
Infeasible after bounds strengthening. Fathoming node 3373.
Infeasible after bounds strengthening. Fathoming node 3489.
Infeasible after bounds strengthening. Fathoming node 3491.
     2000       54                +inf  +1.000000e+00     40   2.4e+01       -        1.03
Infeasible after bounds strengthening. Fathoming node 3753.
Infeasible after bounds strengthening. Fathoming node 3779.
Infeasible after bounds strengthening. Fathoming node 3843.
Infeasible after bounds strengthening. Fathoming node 3859.
Infeasible after bounds strengthening. Fathoming node 3861.
Infeasible after bounds strengthening. Fathoming node 3947.
Infeasible after bounds strengthening. Fathoming node 3985.
Infeasible after bounds strengthening. Fathoming node 4073.
Infeasible after bounds strengthening. Fathoming node 4125.
Infeasible after bounds strengthening. Fathoming node 4091.
Infeasible after bounds strengthening. Fathoming node 4249.
Infeasible after bounds strengthening. Fathoming node 4339.
Infeasible after bounds strengthening. Fathoming node 4421.
Infeasible after bounds strengthening. Fathoming node 4422.
Infeasible after bounds strengthening. Fathoming node 4423.
Infeasible after bounds strengthening. Fathoming node 4419.
Infeasible after bounds strengthening. Fathoming node 4427.
Infeasible after bounds strengthening. Fathoming node 4459.
Infeasible after bounds strengthening. Fathoming node 4533.
Infeasible after bounds strengthening. Fathoming node 4653.
Infeasible after bounds strengthening. Fathoming node 4361.
Infeasible after bounds strengthening. Fathoming node 4825.
Infeasible after bounds strengthening. Fathoming node 4865.
Infeasible after bounds strengthening. Fathoming node 4893.
Infeasible after bounds strengthening. Fathoming node 5025.
Infeasible after bounds strengthening. Fathoming node 5115.
Infeasible after bounds strengthening. Fathoming node 5141.
Explored 3688 nodes in 1.76s.
Absolute Gap -nan Objective inf Lower Bound inf
Integer infeasible.

@chris-maes
Contributor

/merge

@rapids-bot rapids-bot bot merged commit 53d6e74 into NVIDIA:branch-25.10 Sep 19, 2025
201 of 202 checks passed
copy-pr-bot bot pushed a commit that referenced this pull request Sep 22, 2025
This PR introduces the following changes:
- Implements a simple diving procedure
- Allows the branch-and-bound to switch between different search strategies: `BEST_FIRST`, `DEPTH_FIRST` and `MULTITHREADED_BEST_FIRST_WITH_DIVING`
- Refactors the branch-and-bound code so that the `solve` function is now organized into separate methods
- Moves some commonly used variables into member variables of the branch-and-bound solver.

Authors:
  - Nicolas L. Guidotti (https://github.com/nguidotti)
  - Ramakrishnap (https://github.com/rgsl888prabhu)
  - https://github.com/ahehn-nv

Approvers:
  - Gil Forsyth (https://github.com/gforsyth)
  - Akif ÇÖRDÜK (https://github.com/akifcorduk)
  - Trevor McKay (https://github.com/tmckayus)
  - Chris Maes (https://github.com/chris-maes)

URL: #305
@nguidotti nguidotti mentioned this pull request Sep 23, 2025
8 tasks
rapids-bot bot pushed a commit that referenced this pull request Oct 3, 2025
This PR implements a parallel branch-and-bound procedure, which is split into two phases. In the first phase, the algorithm greedily expands the search tree until a certain depth and then adds the bottom nodes to a global heap. The parallel expansion is implemented using `omp task`.

In the second phase, some threads explore the tree using best-first search with plunging, i.e., they take the first node from the global heap and then explore the entire branch that starts at this node. Any unexplored nodes are inserted into the heap. The remaining threads perform deep dives in order to find feasible solutions. The solver keeps a small heap containing the most promising nodes for the dives, which is kept in sync with the global heap.
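
To make the plunging step concrete, here is a sequential toy sketch of best-first search with plunging. It is a simplification under stated assumptions, not the solver's code: the tree is an implicit complete binary tree, the lower bounds are made up, and the names `toy_node_t`, `by_lower_bound` and `best_first_with_plunging` are hypothetical. The first-phase expansion, the diving threads and all parallelism are omitted.

```cpp
#include <queue>
#include <vector>

// Hypothetical toy node: an index into an implicit complete binary tree.
struct toy_node_t {
  int id;
  int depth;
  double lower_bound;
};

struct by_lower_bound {
  bool operator()(const toy_node_t& a, const toy_node_t& b) const
  {
    return a.lower_bound > b.lower_bound;  // min-heap on the lower bound
  }
};

// Returns the order in which node ids are explored.
std::vector<int> best_first_with_plunging(int max_depth)
{
  std::priority_queue<toy_node_t, std::vector<toy_node_t>, by_lower_bound> heap;
  heap.push({0, 0, 0.0});
  std::vector<int> explored;
  while (!heap.empty()) {
    toy_node_t node = heap.top();  // best open node in the global heap
    heap.pop();
    // Plunge: follow one child per level and defer the sibling to the heap.
    while (true) {
      explored.push_back(node.id);
      if (node.depth == max_depth) { break; }
      toy_node_t left{2 * node.id + 1, node.depth + 1, node.lower_bound + 1.0};
      toy_node_t right{2 * node.id + 2, node.depth + 1, node.lower_bound + 2.0};
      heap.push(right);
      node = left;
    }
  }
  return explored;
}
```

With `max_depth = 2`, the exploration order is 0, 1, 3, 2, 5, 4, 6: each pop from the heap is followed by a full descent to a leaf before the next best node is taken.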

This PR also:
- Replaces the `std::thread`-based parallelization in strong branching with OpenMP in order to use dynamic scheduling. This ensures that all threads have a similar amount of work and improves parallel performance.
- Fixes an invalid memory access when trying to access the status of a fathomed node.
- Replaces `std::mutex` with `omp atomic` wherever applicable.
- Adds dedicated classes `dive_queue_t` and `search_tree_t` to store the diving heap and the search tree, respectively.
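
As a hedged illustration of the two OpenMP features mentioned in the list above, the sketch below combines a dynamically scheduled loop with an atomic update. The function `count_positive` is a hypothetical stand-in and does not come from the solver. Build with OpenMP enabled (e.g., `-fopenmp` on GCC or Clang) to run it in parallel; without that flag the pragmas are ignored and the loop runs serially with the same result.

```cpp
#include <vector>

// Hypothetical stand-in for a loop whose iterations have uneven cost.
int count_positive(const std::vector<double>& scores)
{
  int count   = 0;
  const int n = static_cast<int>(scores.size());
  // schedule(dynamic) hands out iterations to threads as they become free,
  // which balances the load when iterations differ in cost.
#pragma omp parallel for schedule(dynamic)
  for (int i = 0; i < n; ++i) {
    if (scores[i] > 0.0) {
      // A single scalar update needs no mutex; an atomic increment suffices.
#pragma omp atomic
      count += 1;
    }
  }
  return count;
}
```

An atomic update is only a substitute for a mutex when the protected operation is a single simple update like this one; larger critical sections still need a lock or a critical region.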

This is an extension of #305.
Closes #320.
Closes #417.

## Benchmark results (MIPLIB2017):

master branch (53d6e74)
```
Average Gap: 0.2174861712
```

This PR:
```
Average Gap: 0.1989485546
```

i.e., the average gap drops by about `1.85` percentage points (roughly `8.5%` in relative terms). In terms of the geomean of the gap ratio, this is equal to `1.62x`.
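
A geometric mean of per-instance ratios can differ substantially from a ratio of arithmetic means, since it weights every instance equally on a multiplicative scale. The sketch below shows one way to compute it. The function name `geometric_mean` is hypothetical, and how the per-instance gap ratios were formed (for example, baseline gap divided by new gap, and how zero gaps were handled) is not stated in this PR, so that part is left to the caller.

```cpp
#include <cmath>
#include <vector>

// Geometric mean of strictly positive ratios, accumulated in log space
// to avoid overflow or underflow of the running product.
// Returns 1.0 for an empty input (a convention chosen for this sketch).
double geometric_mean(const std::vector<double>& ratios)
{
  if (ratios.empty()) { return 1.0; }
  double log_sum = 0.0;
  for (double r : ratios) { log_sum += std::log(r); }
  return std::exp(log_sum / static_cast<double>(ratios.size()));
}
```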

Authors:
  - Nicolas L. Guidotti (https://github.com/nguidotti)

Approvers:
  - Rajesh Gandham (https://github.com/rg20)
  - Chris Maes (https://github.com/chris-maes)

URL: #412
@nguidotti nguidotti deleted the diving branch October 9, 2025 09:56

Labels

improvement Improves an existing functionality mip non-breaking Introduces a non-breaking change

9 participants